Abstract

A search is performed for a vector-like heavy T quark that is produced in pairs and
that decays to a top quark and a Higgs boson. The data analysed correspond to an
integrated luminosity of 19.7 fb$^{-1}$ collected with the CMS detector in proton-proton
collisions at $\sqrt{s} = 8$ TeV. For T quarks with large mass values the top quarks and Higgs
bosons can have significant Lorentz boosts, so that their individual decay products
often overlap and merge. Methods are applied to resolve the substructure of such
merged jets. Upper limits on the production cross section of a T quark with mass
between 500 and 1000 GeV/$c^2$ are derived. If the T quark decays exclusively to tH, the
observed (expected) lower limit on the mass of the T quark is 745 (773) GeV/$c^2$ at 95%
confidence level. For the first time an algorithm is used for tagging boosted Higgs
bosons that is based on a combination of jet substructure information and b tagging.
1 Introduction

The discovery of a Higgs boson with a mass of 125 GeV/$c^2$ [1, 2] motivates the search for exotic
states involving the newly discovered particle. The mechanism that stabilizes the mass of the
Higgs particle is not entirely clear and could be explained by little Higgs models [3, 4], models
with extra dimensions [5, 6], and composite Higgs models [5-7]. These theories predict the
existence of heavy vector-like quarks that may decay into top quarks and Higgs bosons. This
article presents a search for exotic resonances decaying into Higgs bosons and top quarks. A
model of vector-like T quarks with charge 2/3 e, which are produced in pairs by the strong
interaction, is used as a benchmark for this analysis.

The left-handed and right-handed components of vector-like quarks transform in the same
way under the standard model (SM) symmetry group $SU(3)_C \times SU(2)_L \times U(1)_Y$. This allows
direct mass terms in the Lagrangian of the form $m\bar{\psi}\psi$ that do not violate gauge invariance. As
a consequence, vector-like quarks do not acquire their mass via Yukawa couplings, in contrast
to the other quark families. A fourth generation of chiral fermions, replicating one of the three
generations of the SM with identical quantum numbers, is disfavoured by electroweak fits
within the framework of the SM [8]. This is because of the large modifications to the Higgs
production cross sections and branching fractions, if a single SM-like Higgs doublet is assumed.
Vector-like heavy quarks are not similarly constrained by the measurements of the Higgs boson
properties [8].

Vector-like T quarks can decay into three different final states: tH, tZ, and bW [9]. The
assumption of decays with 100% branching fraction (B) has been used in various searches by the
ATLAS and CMS collaborations [10-13]. Other searches that do not make specific assumptions
on the branching fractions have also been performed [14]. In the present analysis the event
selection is optimized to be sensitive to exclusive T quark decays to tH. In addition, the results
are quoted as a function of the branching fractions to the three decay modes: tH, tZ, and bW.

While searches for T quarks have been performed in leptonic final states [10-14], this article
presents the first analysis that exploits the all-hadronic final state in the search for vector-like
quarks. In the SM the Higgs boson decays predominantly into b quark pairs with a branching
fraction of 58% for a mass of 125 GeV/$c^2$, while the top quark decays almost exclusively into a
bottom quark and a W boson, which in turn decays hadronically 67.6% of the time. The main
final state is therefore the all-hadronic final state $T \to tH \to (bjj)(b\bar{b})$, where j denotes the
light-flavour jets of the W boson decay and b denotes the b-flavour jets from the top quark
or Higgs boson decays. For sufficiently large T quark mass values, the decay products can
be highly Lorentz-boosted, leading to final states with overlapping and merged jets. In the
extreme case, all top quark decay products are merged into a single jet. A similar topology may
arise for the Higgs boson decaying into b quarks. A related analysis concept has been proposed
in Ref. [15]. In recent years, the methodology of jet substructure analysis has proved to be very
powerful in resolving such boosted topologies [16-19]. For example, the analysis of high-mass
Z' resonances decaying into top quark pairs became feasible in the all-hadronic final state as
a result of the application of jet substructure methods [20-22]. A similar strategy is followed
in this analysis by applying algorithms for the identification of boosted top quarks (t tagging)
and boosted Higgs bosons (H tagging) in combination with algorithms for the identification of
b quark jets (b tagging). In particular, the application of b tagging in subjets has enhanced the
identification of boosted $b\bar{b}$ final states, for instance $H \to b\bar{b}$ decays. This is the first analysis
to apply an algorithm for tagging boosted Higgs bosons that is based on a combination of jet
substructure information and b tagging.
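
The weight of the all-hadronic channel follows from the two branching fractions quoted above. The short Python sketch below multiplies them; it assumes B(t -> bW) = 1 (the text says only "almost exclusively"), and the function and constant names are illustrative rather than taken from the analysis.

```python
# Branching fractions quoted in the text.
BR_H_TO_BB = 0.58        # H -> b bbar for a Higgs boson mass of 125 GeV/c^2
BR_W_TO_HADRONS = 0.676  # W -> hadrons


def all_hadronic_fraction(br_h_bb: float = BR_H_TO_BB,
                          br_w_had: float = BR_W_TO_HADRONS) -> float:
    """Fraction of T -> tH decays ending in (bjj)(bb).

    Assumption: B(t -> bW) is taken to be exactly 1.
    """
    return br_h_bb * br_w_had


if __name__ == "__main__":
    print(f"all-hadronic fraction per T -> tH decay: {all_hadronic_fraction():.3f}")
```

Under these assumptions roughly 39% of T -> tH decays end in the fully hadronic (bjj)(bb) configuration.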

2 The CMS detector 

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal
diameter. Within the superconducting solenoid volume are a silicon pixel and strip tracker, a lead
tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron
calorimeter (HCAL), each composed of a barrel and two endcap sections. Muons are measured in
gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Extensive
forward calorimetry complements the coverage provided by the barrel and endcap detectors.

The energy resolution for photons with $E_T \approx 60$ GeV varies between 1.1 and 2.6% over the solid
angle of the ECAL barrel, and from 2.2 to 5% in the endcaps. The HCAL, when combined with
the ECAL, measures jets with a resolution $\Delta E/E \approx 100\%/\sqrt{E\,[\mathrm{GeV}]} \oplus 5\%$ [23].
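
To make the resolution formula concrete, the following Python sketch evaluates it, reading the symbol $\oplus$ as a sum in quadrature (the usual convention). The function name and default arguments are illustrative.

```python
import math


def jet_energy_resolution(energy_gev: float,
                          stochastic: float = 1.0,
                          constant: float = 0.05) -> float:
    """Relative resolution dE/E = stochastic/sqrt(E) (+) constant.

    The (+) is interpreted as addition in quadrature.
    """
    if energy_gev <= 0.0:
        raise ValueError("energy must be positive")
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)


if __name__ == "__main__":
    for energy in (50.0, 100.0, 400.0):
        print(f"E = {energy:6.1f} GeV -> dE/E = {jet_energy_resolution(energy):.3f}")
```

At E = 400 GeV the stochastic and constant terms are equal (5% each), so the constant term dominates at higher energies.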

In the region $|\eta| < 1.74$, the HCAL cells have widths of 0.087 in $\eta$ and 0.087 in azimuth ($\phi$). In
the $\eta$-$\phi$ plane, and for $|\eta| < 1.48$, the HCAL cells map on to 5 x 5 ECAL crystal arrays to form
calorimeter towers projecting radially outwards from close to the nominal interaction point.
At larger values of $|\eta|$, the size of the towers increases and the matching ECAL arrays contain
fewer crystals. Within each tower, the energy deposits in ECAL and HCAL cells are summed to
define the calorimeter tower energies, subsequently used to provide the energies and directions
of hadronic jets.

The silicon tracker measures charged particles within the pseudorapidity range $|\eta| < 2.5$. It
consists of 1440 silicon pixel and 15148 silicon strip detector modules and is located in the 3.8 T
field of the superconducting solenoid. For nonisolated particles of $1 < p_T < 10$ GeV/$c$ and
$|\eta| < 1.4$, the track resolutions are typically 1.5% in $p_T$ and 25-90 (45-150) $\mu$m in the transverse
(longitudinal) impact parameter [24].

A more detailed description of the CMS detector, together with a definition of the coordinate 
system used and the relevant kinematic variables, can be found in Ref. [25].


3 Event samples 

The data used for this analysis were collected by the CMS experiment using pp collisions
provided by the CERN LHC with a centre-of-mass energy of 8 TeV, and correspond to an integrated
luminosity of 19.7 fb$^{-1}$. Events are selected online by a trigger algorithm that requires $H_T$, the
scalar sum of the transverse momenta of reconstructed jets in the detector, to be greater than
750 GeV/$c$. The online $H_T$ is calculated from calorimeter jets with $p_T > 40$ GeV/$c$.
Calorimeter jets are reconstructed from the energy deposits in the calorimeter towers, clustered by the
anti-$k_T$ algorithm [26, 27] with a size parameter of 0.5.
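
The online $H_T$ requirement can be sketched in a few lines of Python. The thresholds are those quoted above; the function names and the input format (a plain list of calorimeter-jet transverse momenta in GeV/c) are assumptions made for illustration.

```python
def scalar_ht(jet_pts, min_pt=40.0):
    """Scalar sum of jet transverse momenta above the pT threshold (GeV/c)."""
    return sum(pt for pt in jet_pts if pt > min_pt)


def passes_ht_trigger(jet_pts, ht_threshold=750.0, min_pt=40.0):
    """True if the online H_T exceeds the trigger threshold."""
    return scalar_ht(jet_pts, min_pt) > ht_threshold


if __name__ == "__main__":
    jets = [400.0, 300.0, 60.0, 35.0]  # the 35 GeV/c jet is below threshold
    print(scalar_ht(jets), passes_ht_trigger(jets))
```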

Simulated samples are used to determine signal selection efficiencies as well as the background 
contribution from $t\bar{t}$ plus jets, $t\bar{t}H$, and hadronically decaying W/Z plus b jet production. The
background from QCD multijet production is derived from data. 

Events from T quark decays are generated for mass hypotheses between 500 and 1000 GeV/$c^2$
in steps of 100 GeV/$c^2$. The inclusive cross sections for the signal samples and $t\bar{t}$ samples are
calculated at next-to-next-to-leading order (NNLO) for the reaction $gg \to t\bar{t} + X$. The fixed-order
calculations are supplemented with soft-gluon resummation with next-to-next-to-leading
logarithmic accuracy [28]. The $t\bar{t}$ cross sections are computed based on the TOP++ v2.0
implementation using the MSTW2008nnlo68cl parton distribution functions (PDF) and the 5.9.0
version of LHAPDF [28, 29]. The evaluated $t\bar{t}$ cross section is 252.9 pb, assuming a top quark
mass of 172.5 GeV/$c^2$. The theoretical pair-production cross sections for the signal samples are
listed in Table 1.

The mass of the Higgs boson in the signal samples is set to 120 GeV/$c^2$, as the samples were
produced before the discovery of the Higgs boson. The branching fractions of the Higgs boson
decays are corrected to the expected values for a Higgs boson with a mass of 125 GeV/$c^2$ using
the recommendations from Ref. [30]. The difference between the actual mass of the Higgs boson
(125 GeV/$c^2$) and the simulated mass (120 GeV/$c^2$) has no impact on the analysis results.

The $t\bar{t}$ background sample is generated with POWHEG v1.0 [31-33] interfaced to PYTHIA 6.426
[34] to simulate the parton shower and hadronisation. All other background samples and the
signal samples are simulated with MadGraph 5.1 [35], interfaced with PYTHIA 6.426. The
CTEQ6L1 [36] PDF set is used with MadGraph, while the POWHEG samples have been
produced with CTEQ6M. For PYTHIA, the Z2* tune is used to simulate the underlying event [37].

Simulated QCD multijet samples are used to validate the estimation of this background from 
data. These samples are simulated with MadGraph in the same way as the other background
samples described above. 

4 Event reconstruction 

Tracks are reconstructed using an iterative tracking procedure [24]. The primary vertices are
reconstructed with a deterministic annealing method [38] from all tracks in the event that are
compatible with the location of the proton-proton interaction region. The vertex with the
highest $\sum (p_T^{\mathrm{track}})^2$ is defined as the primary interaction vertex, whose position is determined from
an adaptive vertex fit [39].
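
The primary-vertex choice described above amounts to a simple maximization. In the sketch below each vertex is represented only by the list of transverse momenta of its associated tracks, which is a simplification introduced here for illustration.

```python
def sum_pt_squared(track_pts):
    """Sum of squared track transverse momenta for one vertex."""
    return sum(pt * pt for pt in track_pts)


def select_primary_vertex(vertices):
    """Index of the vertex with the largest sum of squared track pT.

    `vertices` is a list of lists of track pT values (GeV/c), one list per vertex.
    """
    if not vertices:
        raise ValueError("no vertices in event")
    return max(range(len(vertices)), key=lambda i: sum_pt_squared(vertices[i]))


if __name__ == "__main__":
    event = [[1.0, 2.0, 1.5], [10.0, 0.5], [3.0, 3.0, 3.0]]
    print(select_primary_vertex(event))
```

Because the momenta enter squared, a vertex with a single hard track can outrank one with many soft tracks, as in the example event.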

The particle-flow event algorithm [40, 41] reconstructs and identifies each individual particle
with an optimized combination of information from the various elements of the CMS detector.
The energy of photons is directly obtained from the ECAL measurement, corrected for
zero-suppression effects. The energy of electrons is determined from a combination of the electron
momentum at the primary interaction vertex as determined by the tracker, the energy of the
corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially
compatible with originating from the electron track. The energy of muons is obtained from the
curvature of the corresponding track. The energy of charged hadrons is determined from a
combination of their momentum measured in the tracker and the matching ECAL and HCAL
energy deposits, corrected for zero-suppression effects and for the response function of the
calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the
corresponding corrected ECAL and HCAL energy.

For each event, hadronic jets are clustered from these reconstructed particles with the infrared
and collinear-safe anti-$k_T$ algorithm or with the Cambridge-Aachen algorithm (CA jets) [42].
The jet momentum is defined to be the vector sum of all particle momenta in this jet, and is
found in the simulation to be within 5% to 10% of the true momentum over the whole $p_T$
spectrum and detector acceptance. Jet energy corrections are derived from the simulation, and
are confirmed with in situ measurements using the energy balance of dijet and photon+jet
events [43]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and
4% at 1 TeV, to be compared to about 40%, 12%, and 5% obtained when the calorimeters are
used alone for jet clustering.

The jets contain neutral particles from additional collisions within the same beam crossing
(pileup). The contribution from these additional particles is subtracted based on the average
expectation of the energy deposited from pileup in the jet area, using the methods described in
Ref. [44].
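
As a rough illustration of an area-based pileup subtraction, the sketch below assumes the common form in which an average transverse-momentum density `rho` is multiplied by the jet area and subtracted from the raw jet $p_T$. The exact procedure is the one defined in Ref. [44]; the names `rho` and `jet_area` and the clipping at zero are assumptions made here.

```python
def pileup_corrected_pt(raw_pt: float, rho: float, jet_area: float) -> float:
    """Subtract the expected pileup contribution rho * A from the raw jet pT.

    Assumptions: rho is an average pT density per unit area (GeV/c),
    jet_area is the jet catchment area, and negative results are clipped to 0.
    """
    return max(raw_pt - rho * jet_area, 0.0)


if __name__ == "__main__":
    print(pileup_corrected_pt(raw_pt=100.0, rho=10.0, jet_area=0.5))
```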

For the identification of b jets, the combined secondary vertex (CSV) algorithm is used and
the medium operating point (CSVM) is applied [45]. With this operating point the b tagging
efficiency is 70% and the light flavour jet misidentification rate is 1% in $t\bar{t}$ events. This algorithm
uses information from reconstructed tracks and secondary vertices that are displaced from the
primary interaction vertex. The information is combined into a single discriminating variable.
The same b tagging algorithm is used in boosted topologies and the corresponding efficiencies
and misidentification rates are tested in the relevant samples. More details on b tagging in
boosted topologies are given in Section 6.

5 Analysis strategy 

Event selection criteria that make use of novel jet substructure methods are applied to reduce
the large background contributions from QCD multijet and $t\bar{t}$ events in the analysis. The jet
substructure methods are described in detail in Section 6 and the event selection criteria are
summarized in Section 7.

Two variables are used to distinguish signal from background events after the event selection.
These variables are $H_T$ and the invariant mass $m_{b\bar{b}}$ of two b-tagged subjets in Higgs boson
candidate jets. High $H_T$ values characterize events with large hadronic activity as in the case of
signal events.

The shape and normalization of the $H_T$ and $m_{b\bar{b}}$ distributions of QCD multijet events in this
analysis are derived using data in signal-depleted sideband regions. The sideband regions are
defined by inverting the jet substructure criteria. Closure tests are performed with simulated
QCD events to verify that the method predicts the rates and shapes of $H_T$ and $m_{b\bar{b}}$ accurately.
The background determination is discussed in detail in Section 8.

The $H_T$ and $m_{b\bar{b}}$ variables are combined into a single discriminator that enhances the
sensitivity of the analysis. This combination is performed using a likelihood ratio method, which is
described in Section 10.

Two event categories are used in the statistical interpretation of the results: a category with a
single Higgs boson candidate and a category with at least two Higgs boson candidates. These
are denoted as single and multiple H tag categories. They are defined so as to be statistically
independent and are combined in setting the final limit. For the multiple H tag category,
the Higgs boson candidate with the highest transverse momentum is used in the likelihood
definition. The procedure of the limit setting is discussed in detail in Section 10.


6 Jet substructure methods 

Because of the large mass of the T quarks, the top quarks and Higgs bosons from T quark
decays would have significant Lorentz boosts. Daughter particles of these top quarks are
therefore not well separated. In many cases all of the top quark decay products are clustered into a
single, large jet by the event reconstruction algorithms. The approximate spread of a hadronic
top quark decay can be determined on simulated events from the $\Delta R$ distances between the
quarks produced during its decay. The four-momenta of the two quarks with the smallest $\Delta R$
distance, $\Delta R(q_1, q_2)$, are vectorially summed and the $\Delta R$ distance between the vector sum and
the third quark, $\Delta R(q_{1+2}, q_3)$, is evaluated. The larger of $\Delta R(q_1, q_2)$ and
$\Delta R(q_{1+2}, q_3)$ indicates the approximate size $\Delta R_{bjj}$ needed to cluster the entire top quark decay
within one single CA jet. For the boosted decays of a Higgs boson in $H \to b\bar{b}$ events, the
corresponding quantity can be defined as the angular distance $\Delta R_{b\bar{b}} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ between
the two generated b quarks. Figure 1 shows the distributions of these quantities plotted as a
function of the transverse momentum of the top quark and of the Higgs boson, generated from
the decay of a T quark with a mass of 1000 GeV/$c^2$. This shows that, for large transverse
momenta, and hence for large T quark mass values, the decay products from Higgs bosons and top
quarks are generally collimated and are difficult to separate using standard jet reconstruction
algorithms.
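
The size estimate for a hadronic top quark decay can be written out as a short Python sketch. It follows the three steps given above (closest quark pair, vector sum, distance to the third quark). The four-vector representation as (E, px, py, pz) tuples and all function names are choices made for this illustration, and non-zero transverse momentum is assumed for every vector.

```python
import math
from itertools import combinations


def four_vector(pt, eta, phi, mass=0.0):
    """Build an (E, px, py, pz) tuple from pT, eta, phi and mass."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def eta_phi(p):
    """Pseudorapidity and azimuth of a four-vector (requires pT > 0)."""
    _, px, py, pz = p
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)


def delta_r(a, b):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    eta_a, phi_a = eta_phi(a)
    eta_b, phi_b = eta_phi(b)
    dphi = (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta_a - eta_b, dphi)


def top_decay_size(quarks):
    """Larger of dR(q1, q2) and dR(q1+q2, q3), where (q1, q2) is the closest pair."""
    i, j = min(combinations(range(3), 2),
               key=lambda ij: delta_r(quarks[ij[0]], quarks[ij[1]]))
    k = 3 - i - j  # index of the remaining quark
    pair_sum = tuple(x + y for x, y in zip(quarks[i], quarks[j]))
    return max(delta_r(quarks[i], quarks[j]), delta_r(pair_sum, quarks[k]))


if __name__ == "__main__":
    quarks = [four_vector(100.0, 0.0, 0.0),
              four_vector(100.0, 0.0, 0.2),
              four_vector(100.0, 0.0, 1.0)]
    print(f"approximate decay size: {top_decay_size(quarks):.2f}")
```

In the example the closest pair is separated by 0.2, its sum points at an azimuth of 0.1, and the third quark lies 0.9 away, so a jet size of about 0.9 would be needed.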

The approach adopted by this analysis is to apply the CA algorithm using a large size
parameter R = 1.5, in order to cluster the decay products from top quarks and Higgs bosons into single
large CA jets, using an implementation based on FastJet 3.0 [27]. To identify these so-called
"top jets" and "Higgs jets", the analysis uses dedicated jet substructure tools, in particular a t
tagging algorithm and an H tagging algorithm that relies on b tagging of individual subjets. A
more detailed description of these algorithms is provided in the following sections.

Figure 1: The distribution of the angular distance $\Delta R_{bjj}$ between the three top quark decay
products as a function of the top quark $p_T$ for simulated T quark events with a T quark mass
of 1000 GeV/$c^2$ (left). Distribution of the angular distance $\Delta R_{b\bar{b}}$ of the two generated b quarks
from Higgs boson decays versus the Higgs boson $p_T$, for the same event sample (right).


6.1 Subjet b tagging and H tagging 

It is not possible to identify b jets in boosted top quark decays using the standard CMS b tagging 
algorithms, since these are based on separated, non-overlapping jets. For dense environments 
where standard jet reconstruction algorithms are not suitable, two dedicated b tagging concepts 
have been investigated: 

• tagging of CA jets, reconstructed using a distance parameter of 0.8 (CA8 jets) or 1.5 
(CA15 jets). The 0.8 and the 1.5 jet size parameters are used because they have been 
found to provide optimal performance for large and for intermediate boost ranges, 
respectively, as discussed in the following sections. 

• tagging of subjets that are reconstructed within CA jets. 

The subjets of CA15 jets are reconstructed using the "filtering algorithm" [16], splitting jets
into subjets based on an angular distance of R = 0.3. Only the three highest $p_T$ subjets are
retained. This filtering algorithm has been found to provide the best mass resolution for CA15
jets compared to the jet pruning [46] and trimming [47] algorithms. The pruning, trimming, and
filtering algorithms are often referred to as jet grooming algorithms and their main purpose is
to remove soft and wide-angle radiation as well as pileup contributions. Subjets of CA8 jets are
reconstructed using the pruning algorithm, which is found to give the best performance for the
reduced jet size.

Figure 2: Performance of the CSV b tagging algorithm in simulated events with CA15 jets and
with subjets within the same CA15 jet. The misidentification probability for inclusive QCD
jets is shown versus the b tagging efficiency for boosted top quarks originating from T quark
decays, for CA15 jet transverse momentum ranges of (left) $200 < p_T < 400$ GeV/$c$ and (right)
$800 < p_T < 1000$ GeV/$c$.

For the application of b tagging to CA jets, tracks in a wide region around the jet axis are 
considered. The association region corresponds to the size of the CA jet. For the application 
of b tagging to subjets, tracks in a region of $\Delta R < 0.3$ around the subjet axis are used by the b
tagging algorithm. This is the cone size employed by the standard CMS b tagging algorithms, 
and has also been found to give good performance for subjet b tagging. 

The advantage of subjet b tagging is that it allows two subjets within a single CA jet to be 
identified as b jets. This is the main component of the H tagging algorithm that distinguishes 
between boosted Higgs bosons decaying to $b\bar{b}$ and boosted top quarks.


6.1.1 Algorithm performance 

Figure 2 shows the performance of subjet b tagging compared to CA15 jet b tagging for events
with boosted top quarks that originate from T quark decays. The choice of the clustering
algorithm and the cone size is driven by the t tagging algorithm, described in Section 6.2. The
b tagging efficiency is plotted versus the misidentification probability for inclusive QCD jets.
Two different regions of transverse jet momentum are shown. It can be seen that subjet b
tagging outperforms the CA15 jet b tagging.


Figure 3: Performance of different H tagging algorithms in simulated signal events, with a
signal mass hypothesis of 1000 GeV/$c^2$. The misidentification probability for inclusive QCD
jets is shown versus the tagging efficiency for boosted Higgs boson decays, for jet transverse
momentum ranges of (left) $150 < p_T < 300$ GeV/$c$ and (right) $300 < p_T < 500$ GeV/$c$. Different
b tagging options are compared: standard b tagging of AK5 jets, subjet b tagging of CA15 and
CA8 jets, and b tagging of CA15 jets and CA8 jets. For the case of subjet b tagging, two subjets
are required to pass the b tagging criteria. Similarly, two AK5 jets are required to pass the b
tagging criteria for standard b tagging.

For the identification of boosted Higgs bosons, two subjets must be b tagged and their invariant
mass must be greater than 60 GeV/$c^2$. Both CA8 jets and CA15 jets are considered. The
performance of the H tagging algorithm is shown in Fig. 3 for two different regions of transverse jet
momentum. The tagging efficiency is shown versus the misidentification probability for
inclusive QCD jets. Figure 4 shows the performance obtained when evaluating the misidentification
probability from $t\bar{t}$ events. The performance of the standard b tagging algorithm based on AK5
jets is also shown. A CA15 jet is considered as satisfying the H tagging requirement if two AK5
jets satisfy the b tagging requirement and have a distance $\Delta R < 1.1$ from the CA15 jet. Overall,
subjet b tagging is found to provide better performance than b tagging based on AK5 jets. The
choice of the optimal CA jet size parameter R depends on the $p_T$ region considered. A size of
R = 1.5 is found to be optimal for most signal mass hypotheses and is chosen for the analysis.
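
The H tagging requirement (two b-tagged subjets with an invariant mass above 60 GeV/$c^2$) can be sketched as follows. The `Subjet` type, the boolean `b_tagged` flag standing in for the CSVM decision, and the choice of the two highest-$p_T$ b-tagged subjets when more than two are tagged are assumptions of this illustration, not details taken from the analysis.

```python
import math
from typing import NamedTuple, Sequence


class Subjet(NamedTuple):
    e: float
    px: float
    py: float
    pz: float
    b_tagged: bool  # stands in for the CSVM b tagging decision


def invariant_mass(a: Subjet, b: Subjet) -> float:
    """Invariant mass of the two-subjet system (negative m^2 clipped to 0)."""
    e, px, py, pz = a.e + b.e, a.px + b.px, a.py + b.py, a.pz + b.pz
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def is_higgs_tagged(subjets: Sequence[Subjet], min_mass: float = 60.0) -> bool:
    """Require two b-tagged subjets with invariant mass above min_mass (GeV/c^2).

    Assumption: if more than two subjets are b-tagged, the two with the
    highest pT are used.
    """
    tagged = sorted((s for s in subjets if s.b_tagged),
                    key=lambda s: math.hypot(s.px, s.py), reverse=True)
    if len(tagged) < 2:
        return False
    return invariant_mass(tagged[0], tagged[1]) > min_mass


if __name__ == "__main__":
    jet = [Subjet(100.0, 80.0, 60.0, 0.0, True),
           Subjet(100.0, 80.0, -60.0, 0.0, True),
           Subjet(20.0, 20.0, 0.0, 0.0, False)]
    print(is_higgs_tagged(jet))
```

In the example the two b-tagged subjets combine to an invariant mass of 120 GeV/$c^2$, so the jet passes.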


6.1.2 Scale factors 

The subjet b tagging efficiency has been measured in data using a sample of semileptonic $t\bar{t}$
events. Scale factors have been derived to correct the efficiency predicted by simulation to that
measured in data. The "flavor-tag consistency" (FTC) method [45] has been used to measure
these scale factors. The FTC method requires consistency between the number of b-tagged jets
in data and simulation for boosted top quark events. A maximum likelihood fit is performed
in which the b tagging efficiency scale factor $SF_b$ and the $t\bar{t}$ cross section are free parameters.
Usually the light flavour misidentification scale factor $SF_{\mathrm{light}}$ is fixed to a value obtained
independently, but in this case the simultaneous fit of $SF_{\mathrm{light}}$, $SF_b$, and the $t\bar{t}$ cross section has been
performed for the first time. This method relies on simulation for the flavour of the subjets. A
systematic uncertainty of 2% in the subjet flavour composition is taken into account.

The FTC method is applied to three different $p_T$ regions of the CA15 jet: $150 < p_T < 350$ GeV/$c$,
$p_T > 350$ GeV/$c$, and $p_T > 450$ GeV/$c$. No significant deviation of the scale factors for the three
different samples is observed. Both the scale factors $SF_b$ and $SF_{\mathrm{light}}$ are found to be in agreement
with the scale factors measured for standard b tagging of AK5 jets in the non-boosted regime.

Figure 4: Performance of different H tagging algorithms in simulated signal events, with a
signal mass hypothesis of 1000 GeV/$c^2$. The misidentification probability for the $t\bar{t}$ background
is shown versus the tagging efficiency for boosted Higgs boson decays, for jet transverse
momentum ranges of (left) $150 < p_T < 300$ GeV/$c$ and (right) $300 < p_T < 500$ GeV/$c$. Different b
tagging options are compared: standard b tagging of AK5 jets, subjet b tagging of CA15 and
CA8 jets, and b tagging of CA15 jets and CA8 jets. For the case of subjet b tagging, two subjets
are required to pass the b tagging criteria. Similarly, two AK5 jets are required to pass the b
tagging criteria for standard b tagging.


The efficiency of the invariant mass selection requirement for the two b-tagged subjets of the 
Higgs boson candidate is validated with a sample of semileptonic $t\bar{t}$ events. Since no sample of
Higgs bosons decaying into b quark pairs can be obtained in data, the validation procedure is 
based on the selection of a pure sample of W bosons. 


The selection of semileptonic $t\bar{t}$ events requires a muon and a b-tagged AK5 jet. In addition,
one CA15 jet is required to be selected by the t tagging algorithm (see Section 6.2). The
t-tagged jet must have exactly one b-tagged subjet. The two subjets that are not b-tagged are
used to calculate the invariant mass of a W boson candidate. The distribution of the W boson
candidate mass is shown in Fig. 5. The shape of the W boson candidate mass distribution is
the same in data and simulation and no additional scale factors or systematic uncertainties are
assigned.


6.2 t tagging 

The HEPTopTagger algorithm, described in Ref. [19], is applied based on the implementation
in FastJet 3.0 [27]. The algorithm uses CA15 jets as input. This choice of jet size is suitable
for the region of phase space with intermediate boosts (with a jet $p_T$ slightly above 200 GeV/$c$).
When the T quark mass is below 1 TeV/$c^2$, a considerable fraction of the decay products
populate the intermediate boost range. Such resolved events could in principle be reconstructed
with standard methods using AK5 jets. The HEPTopTagger provides a seamless transition
between the non-boosted and boosted domains.

Figure 5: Distribution of the invariant di-subjet mass of a hadronically decaying W boson
obtained from a semileptonic $t\bar{t}$ sample. The lower panel shows the ratio of data and simulation.
The hatched area indicates the uncertainty in the signal and background cross sections.

For each jet, the HEPTopTagger analyses the substructure by stepping backward through
the clustering history of the jet in an iterative procedure until the conditions for splitting are
no longer fulfilled and the subjets are not split any further. The filtering algorithm is applied
to each combination of three subjets that are found. The filtering algorithm reclusters the
constituents with a variable distance parameter $R_{\mathrm{filt}} = \min(0.3, \Delta R_{ij}/2)$, where i and j are the
closest subjets in $\Delta R$ in the subjet triplet. The five reclustered subjets with the largest $p_T$ are
retained and the invariant mass of their sum is taken as the mass of the top quark candidate. The
configuration that has an invariant mass closest to the top quark mass is chosen. The constituents of the
five leading reclustered subjets are further reclustered using the exclusive CA algorithm, which
forces the jet to have exactly three final subjets. The HEPTopTagger uses these three final
subjets and selects top quark jets based on the pairwise and three-way subjet masses. Selections
are applied in the two-dimensional plane defined by the ratio $m_{23}/m_{123}$ and the arctangent of
$m_{13}/m_{12}$. Here $m_{23}$ is the pairwise mass of the second and third leading subjets. The variables
$m_{12}$, $m_{13}$, and $m_{123}$ are defined in a similar fashion. The distribution of events in this plane is
shown for simulated $t\bar{t}$ events in Fig. 6 (left) and for a mixture of background (boson+jets,
diboson, single top quark, $t\bar{t}$ all-hadronic, and $t\bar{t}$ leptonic) events in Fig. 6 (right). A clearly
enhanced region is present only for $t\bar{t}$ events. The region is highlighted by the thick
black lines in Fig. 6. This structure can be used to suppress backgrounds that do not contain
boosted top quarks by rejecting events that lie outside of this region. Additionally, a selection
on the top candidate mass, $140 < m_{123} < 250$ GeV/$c^2$, is applied. Another populated region
shows up below and to the left of the selected region because of unmerged top decays. This
contribution disappears for boosted top quarks with $p_T > 300$ GeV/$c$.
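
The quantities entering the selection can be computed directly from the three final subjets. The Python sketch below evaluates the filtering radius, the two plane coordinates, and the mass window. It is not the HEPTopTagger implementation: the boundary of the selected region in the two-dimensional plane is not reproduced, and the tuple representation (E, px, py, pz) and function names are choices made here.

```python
import math


def mass(*vectors):
    """Invariant mass of the sum of (E, px, py, pz) tuples (negative m^2 clipped to 0)."""
    e, px, py, pz = (sum(component) for component in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def filter_radius(delta_r_ij: float) -> float:
    """Variable filtering distance parameter R_filt = min(0.3, dR_ij / 2)."""
    return min(0.3, delta_r_ij / 2.0)


def top_tagger_variables(j1, j2, j3):
    """Return (m23/m123, arctan(m13/m12), m123) for subjets ordered by decreasing pT."""
    m12, m13, m23 = mass(j1, j2), mass(j1, j3), mass(j2, j3)
    m123 = mass(j1, j2, j3)
    return m23 / m123, math.atan2(m13, m12), m123


def in_top_mass_window(m123: float, low: float = 140.0, high: float = 250.0) -> bool:
    """Top candidate mass requirement 140 < m123 < 250 GeV/c^2."""
    return low < m123 < high


if __name__ == "__main__":
    subjets = [(100.0, 100.0, 0.0, 0.0), (80.0, 0.0, 80.0, 0.0), (60.0, 0.0, 0.0, 60.0)]
    ratio, angle, m123 = top_tagger_variables(*subjets)
    print(f"m23/m123 = {ratio:.3f}, arctan(m13/m12) = {angle:.3f}, m123 = {m123:.1f}")
```

`math.atan2(m13, m12)` equals arctan(m13/m12) for positive masses and avoids a division by zero when m12 vanishes.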

Figure 6: Two-dimensional distributions of $m_{23}/m_{123}$ versus $\arctan(m_{13}/m_{12})$ for
HEPTopTagger jets in simulated $t\bar{t}$ events (left) and in simulated background events (right). The
simulated background consists of boson+jets, diboson, single top quark, $t\bar{t}$ all-hadronic, and $t\bar{t}$
leptonic. The area enclosed by the thick solid lines denotes the region selected by the
HEPTopTagger.


6.2.1 Algorithm performance 

The selection criteria used in the algorithm are varied iteratively and the efficiency and mistag
rate are calculated for each iteration. The minimum mistag rate for a given signal efficiency is
shown in Fig. 7. The HEPTopTagger curve is determined by fixing the $m_{123}$ selection ($140 <
m_{123} < 250$ GeV/$c^2$) and varying the width of the region selected by the algorithm. The other
curve is obtained by applying simultaneously the HEPTopTagger and the subjet b tagging
criteria and varying their requirements. Details of these selection criteria are given in Ref. [48].
Three working points are defined as indicated by markers in the figure. The working point used
in this analysis is WP2, which is defined by the standard HEPTopTagger criteria in addition
to a b-tagged subjet identified with the CSVM b tagging algorithm. The other working points
(WP1 and WP0) use relaxed HEPTopTagger criteria and relaxed b tagging, and are used to
validate the scale factor measurements which are described in the following section.

6.2.2 Scale factors 

A semileptonic $t\bar{t}$ sample is used to study boosted hadronic top quark decays in data. This
sample is then used to measure data-to-simulation scale factors for the t tagging efficiency
using WP2. This procedure was introduced in Ref. [20]. The $t\bar{t}$ sample is defined by requiring
one muon and at least one b-tagged AK5 jet. Additionally, a top quark candidate CA15 jet
is required, with high transverse momentum $p_T > 200$ GeV/$c$ and with at least one b-tagged
subjet. This semileptonic selection is very pure and background contributions are negligible.
The efficiency of the HEPTopTagger is determined as the fraction of top quark candidate
CA15 jets that pass all of the tagging requirements. These measurements yield scale factors
ranging from 0.85 to 1.15 depending on the $p_T$ and the $\eta$ of the jet.


7 Event selection 

Figure 7: Mistag rate versus t tagging efficiency for the HEPTopTagger and the combination
of the HEPTopTagger with subjet b tagging, for CA15 jets matched to generated partons with
$p_T > 200$ GeV/$c$. The mistag rate is obtained from simulated QCD multijet events, while the
efficiency is determined using simulated $t\bar{t}$ events.

The $H_T$ variable used in the analysis is calculated from the transverse momenta of all subjets
within the reconstructed CA15 jets with $p_T > 150$ GeV/$c$. This definition is more accurate
than that used in the trigger because particle-flow reconstruction is exploited. A threshold of
$H_T > 720$ GeV/$c$ is applied in the offline analysis as the trigger is almost fully efficient above
this value. The simulation is corrected to match the data by weighting events based on the ratio
between the trigger efficiency calculated in data and in simulation. The systematic uncertainty
introduced by this procedure is discussed in Section 9.

The full event selection requires the following criteria to be fulfilled: 

• At least one CA15 jet must be t-tagged by the HEPTOPTAGGER algorithm and must
contain at least one b-tagged subjet (identified by the CSV b tagging algorithm at the
medium operating point). The t-tagged jets must have p_T > 200 GeV/c.

• At least one CA15 jet must have p_T > 150 GeV/c and must be H-tagged (at least two
subjets identified by the CSVM b tagging algorithm). The invariant mass of the two
b-tagged subjets has to be larger than 60 GeV/c^2. This jet must not be identical to the
top-quark candidate jet.
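
The two requirements above, together with the offline H_T threshold, can be sketched as a small selection function. This is only an illustration under stated assumptions: the `CA15Jet` class, its field names, and the rule for choosing the top quark candidate (the highest-p_T t-tagged jet that leaves at least one other H-tagged jet) are hypothetical and are not taken from the analysis code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CA15Jet:
    """Hypothetical container for the jet quantities used in the selection."""
    pt: float              # transverse momentum in GeV/c
    hep_top_tagged: bool   # passes the HEPTopTagger criteria
    n_b_subjets: int       # number of subjets passing the CSVM b tag
    m_bb: float = 0.0      # invariant mass of the two b-tagged subjets, GeV/c^2


def is_t_tagged(jet: CA15Jet) -> bool:
    # t tag: HEPTopTagger plus at least one b-tagged subjet, p_T > 200 GeV/c
    return jet.hep_top_tagged and jet.n_b_subjets >= 1 and jet.pt > 200.0


def is_h_tagged(jet: CA15Jet) -> bool:
    # H tag: at least two b-tagged subjets with m_bb > 60 GeV/c^2, p_T > 150 GeV/c
    return jet.pt > 150.0 and jet.n_b_subjets >= 2 and jet.m_bb > 60.0


def categorize(jets: List[CA15Jet], ht: float) -> Optional[str]:
    """Return 'single', 'multi', or None if the event is rejected."""
    if ht <= 720.0:  # offline H_T threshold in GeV/c
        return None
    # Assumption: try t-tagged jets in order of decreasing p_T.
    t_candidates = sorted((j for j in jets if is_t_tagged(j)),
                          key=lambda j: j.pt, reverse=True)
    for top in t_candidates:
        # The H-tagged jet must not be identical to the top quark candidate.
        n_h = sum(1 for j in jets if j is not top and is_h_tagged(j))
        if n_h >= 1:
            return "single" if n_h == 1 else "multi"
    return None
```

For example, an event with one t-tagged jet and one separate H-tagged jet falls in the single H tag category, while a lone jet that satisfies both sets of criteria does not pass.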

As mentioned in Section 5, the event selection is split further into two categories: single and
multiple H tags. 

The number of reconstructed CA15 jets predicted by simulation with p_T > 150 GeV/c is shown
in the left plot of Fig. 8, while the right plot shows the number of jets passing the t tagging
criteria. In the following figures the hatched regions indicate the statistical uncertainty in the 
simulated background. The signal hypotheses are represented by the solid and dashed lines. 

The impact of subjet b tagging is visible in Fig. 9. The left plot shows the number of t-tagged
CA15 jets with a subjet b tag, while the right plot shows the number of H-tagged jets for events 
that have at least one t-tagged CA15 jet with a subjet b tag. These figures demonstrate the 
strong reduction of QCD multijet background by the jet substructure criteria. 

The number of selected events for each signal sample of the benchmark model and the selection
efficiencies, derived from simulated events, are given in Table 1.

Figure 8: Left: multiplicity of CA15 jets with p_T > 150 GeV/c. Events with at least two of
these jets are selected. Right: multiplicity of CA15 jets with p_T > 200 GeV/c that are selected
by the HEPTOPTAGGER algorithm. The solid histograms represent the simulated background
processes (tt and QCD multijet). The hatched error bands show the statistical uncertainty of
the simulated events.

Figure 9: Left: multiplicity of CA15 jets that are selected by the HEPTOPTAGGER and contain
a b-tagged subjet, after requiring at least one jet per event to be selected by the HEPTOPTAGGER
algorithm. Right: multiplicity of CA15 jets with p_T > 150 GeV/c satisfying the H tagging
criteria. Events with three or more H tags are included in the bin with two H tags. The solid
histograms represent the simulated background processes (tt and QCD multijet). The hatched
error bands show the statistical uncertainty of the simulated events.

Table 1: Cross section, expected numbers of selected events, and the selection efficiencies for
several signal samples with different values of the T quark mass for an integrated luminosity
of 19.7 fb^-1. The signal samples assume B(T → tH) = 100%. The efficiencies are calculated
relative to an inclusive sample with no requirements on top quark or Higgs boson decay modes,
and without any selection criteria applied.

T quark mass    production            expected    selection
(GeV/c^2)       cross section (pb)    events      efficiency
 500            0.59                  283.0       2.5%
 600            0.174                 152.0       4.4%
 700            0.059                  69.3       6.0%
 800            0.021                  30.3       7.2%
 900            0.0083                 12.1       7.3%
1000            0.0034                  4.9       7.2%
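
The columns of Table 1 are related by N = σ × L × ε, where L is the integrated luminosity and ε the selection efficiency. The short check below is a sketch of that arithmetic using the tabulated values; the function and variable names are illustrative only. Because the efficiencies in the table are rounded to two significant figures, the recomputed yields agree with the quoted ones only to within a few percent.

```python
LUMINOSITY_PB = 19.7e3  # 19.7 fb^-1 expressed in pb^-1

# T quark mass (GeV/c^2): (cross section in pb, efficiency, quoted expected events)
TABLE_1 = {
    500: (0.59, 0.025, 283.0),
    600: (0.174, 0.044, 152.0),
    700: (0.059, 0.060, 69.3),
    800: (0.021, 0.072, 30.3),
    900: (0.0083, 0.073, 12.1),
    1000: (0.0034, 0.072, 4.9),
}


def expected_events(xsec_pb: float, efficiency: float,
                    lumi_pb: float = LUMINOSITY_PB) -> float:
    """Expected yield N = sigma * L * epsilon."""
    return xsec_pb * lumi_pb * efficiency


for mass, (xsec, eff, quoted) in TABLE_1.items():
    n = expected_events(xsec, eff)
    print(f"m_T = {mass:4d} GeV/c^2: computed {n:6.1f}, quoted {quoted:6.1f}")
```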


8 Background estimation 


The tt background is evaluated from simulated events, corrected for differences between data 
and simulation in b tagging and trigger efficiencies described above. The uncertainties in the 
normalization and shape of tt events are discussed in Section 9. Background contributions from
ttH and hadronically decaying W/Z plus heavy flavour processes are found to be below 1% 
and are neglected. 

The QCD multijet background is estimated in data using a two-dimensional sideband
extrapolation. In this method, two uncorrelated criteria in the event selection are inverted to
obtain sideband regions that are enriched in QCD multijet events and depleted in signal events.
Inverting each criterion individually, as well as both at the same time, results in three exclusive
sideband regions, denoted A, B and C:


• Sideband region B is obtained by inverting the selection criteria of the HEPTOPTAGGER
algorithm. The top quark mass window as well as all requirements on the pairwise subjet
mass in the HEPTOPTAGGER are inverted. Events outside of the selected region shown in
Fig. 6 (Section 6) are used to define the inverted HEPTOPTAGGER control region, while
the events that are inside define the signal region. Details of these selection criteria
of the HEPTOPTAGGER are given in Section 6.


• Sideband region C is obtained by inverting the H tagging algorithm. Only events 
with zero H tags are selected and the requirement on the pairwise subjet mass is 
removed. 

• Sideband region A is obtained by inverting both the H tagging and the t tagging 
algorithms as described above. 

• Events in the signal region D have all tagging requirements applied. 
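
The four regions listed above form a simple two-by-two grid in the two inverted criteria. As a minimal sketch (the function name and boolean flags are hypothetical, and "inverted" stands for the specific inversions described in the list, not merely failing a tag), the assignment can be written as:

```python
def sideband_region(t_tag_inverted: bool, h_tag_inverted: bool) -> str:
    """Map the state of the two criteria to region A, B, C, or D.

    t_tag_inverted: event satisfies the inverted HEPTopTagger criteria
    h_tag_inverted: event has zero H tags (inverted H tagging)
    """
    regions = {
        (False, False): "D",  # signal region: all tagging requirements applied
        (True, False): "B",   # inverted t tagging only
        (False, True): "C",   # inverted H tagging only
        (True, True): "A",    # both criteria inverted
    }
    return regions[(t_tag_inverted, h_tag_inverted)]
```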

The tt contamination in the sideband regions amounts to a maximum of 8% in region C. This
is accounted for by subtracting the tt contribution predicted by the simulation in each of the
sideband regions. Backgrounds due to ttH and hadronically decaying W/Z plus heavy flavour
processes are found to have a negligible contribution in the sideband regions. A signal injection
test has been performed to evaluate the impact of a hypothetical signal on the background
model. It has been found that the signal contamination in the sideband regions leads to a small
effect of less than 1.4% for m_T = 700 GeV/c^2 on the measured QCD multijet event rate, and
therefore the possible signal contamination in the sideband regions is neglected in the analysis.

The QCD multijet yield in the signal region is calculated as

    R_D = R_B \cdot \frac{R_C}{R_A},    (1)

where R_A, R_B, and R_C denote the rates of events in sidebands A, B, and C, respectively. The
tt contamination in the sideband regions is subtracted. The event rates in the three sideband
regions and the signal region are provided in Table 2. The resulting predictions of the QCD
multijet backgrounds are given in Table 3 for the two event categories.
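
As a cross-check of the extrapolation, the sketch below evaluates R_D = R_B R_C / R_A with the tt-subtracted sideband yields quoted in Table 2. The statistical uncertainty is estimated here by simple Poisson error propagation on the raw sideband counts; that propagation is an assumption of this example and not necessarily the procedure used in the analysis, so it reproduces the quoted uncertainties only approximately.

```python
import math

# Yields from Table 2: raw data counts and counts after tt subtraction.
RAW = {"A": 1152640, "C": 140911, "B_single": 8384, "B_multi": 1157}
SUBTRACTED = {"A": 1146464, "C": 129972, "B_single": 8089, "B_multi": 1123}


def abcd_prediction(r_a: float, r_b: float, r_c: float) -> float:
    """Signal-region yield R_D = R_B * R_C / R_A."""
    return r_b * r_c / r_a


def relative_stat_uncertainty(n_a: float, n_b: float, n_c: float) -> float:
    """Assumed Poisson propagation: sum of 1/N over the three sidebands."""
    return math.sqrt(1.0 / n_a + 1.0 / n_b + 1.0 / n_c)


for category in ("single", "multi"):
    key = f"B_{category}"
    pred = abcd_prediction(SUBTRACTED["A"], SUBTRACTED[key], SUBTRACTED["C"])
    rel = relative_stat_uncertainty(RAW["A"], RAW[key], RAW["C"])
    print(f"{category} H tag: R_D = {pred:.1f} +- {pred * rel:.1f}")
```

The central values round to the 917 and 127 events quoted for the single and multi H tag categories.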


The closure of this method is verified with simulated QCD multijet events. As the method
assumes the selection criteria defining the sideband regions to be uncorrelated, the following
condition must be fulfilled:

    \frac{R_A}{R_B} = \frac{R_C}{R_D}.    (2)

According to simulation, the ratios are R_A/R_B = 185 ± 5 (1417 ± 97) and R_C/R_D = 185 ± 17
(1203 ± 250) for the single (multi) H tag event category. The quoted uncertainties are statistical.
It can be seen that the ratios agree within the statistical uncertainties. The largest uncertainties
occur in the R_C/R_D ratio and are about 10 (20)% for the single (multi) H tag category.
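
The agreement of the two ratios can be quantified with a simple pull, that is, the difference divided by the combined statistical uncertainty. The sketch below uses only the numbers quoted above; treating the two ratios as statistically independent is an assumption of this example.

```python
import math

# (R_A/R_B, uncertainty, R_C/R_D, uncertainty) from the simulated closure test
CLOSURE = {
    "single": (185.0, 5.0, 185.0, 17.0),
    "multi": (1417.0, 97.0, 1203.0, 250.0),
}


def pull(a: float, sigma_a: float, b: float, sigma_b: float) -> float:
    """Difference in units of the combined uncertainty (assumes independence)."""
    return (a - b) / math.hypot(sigma_a, sigma_b)


for category, (ab, s_ab, cd, s_cd) in CLOSURE.items():
    print(f"{category} H tag: pull = {pull(ab, s_ab, cd, s_cd):+.2f}, "
          f"relative uncertainty on R_C/R_D = {100.0 * s_cd / cd:.0f}%")
```

Both pulls are below one standard deviation, and the relative uncertainties on R_C/R_D come out close to the 10% and 20% quoted in the text.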


In addition to the event yields, the shapes of the H_T and m_bb distributions for the QCD
multijet processes are also derived from the sideband regions. For both the H_T and m_bb variables
the sideband region B (inverted t tagger) is used. The expected contribution from tt events is
subtracted from the sideband.


Closure is also verified for the shape of H_T and m_bb distributions in the signal and sideband
regions. Figure 10 shows a comparison of the H_T and m_bb shapes in the sideband and signal
regions for the single and the multiple H tag event categories. The distributions agree within
statistical uncertainties.


The method has also been validated in data. The shapes of the simulated H_T and m_bb
distributions in the signal region agree well with the predicted distributions in data. The absolute
rate of events shows a disagreement between simulation and the data-derived rate of a factor
of two. This disagreement is taken into account when assigning systematic uncertainties in the
background, as explained in Section 9.

Table 2: Event rates in the signal and sideband regions obtained from the two-dimensional
sideband extrapolation in data for the two H tag categories. The tt contamination is subtracted
from the nominal yield in the sideband regions. The prediction of the QCD multijet event rate
in the signal region D is given along with statistical uncertainties that arise from the limited
size of event samples in the sideband regions. The sideband regions A and C are common to
both H tag multiplicity categories.

              common to both    single H tag    multi H tag
              categories        category        category

              region A          region B        region B
data          1152640           8384            1157
data - tt     1146464           8089            1123

              region C          region D        region D
data          140911
data - tt     129972
prediction                      917 ± 11        127 ± 4

Figure 10: Comparison of the H_T (left) and m_bb (right) distributions in the sideband region B
and signal region for the single (top) and multiple (bottom) H tag event categories for simulated
QCD multijet events. All distributions are normalized to unity for shape comparison. The
lower panels in the figures show the ratio of the signal and sideband regions.


9 Systematic uncertainties 

As the analysis relies on simulation for the tt background prediction, a careful evaluation of 
uncertainties affecting both the normalization and shape of the tt background events is needed. 
This is also required for the simulated signal events. 

The QCD multijet background is obtained from data. The rate and shape of the tt background
have an effect on the measurement of the QCD multijet background because the tt
contamination in the sideband region is subtracted from data.


Table 3: Predicted background contributions in the signal region for the two event categories
with one and with multiple H tags. Statistical uncertainties in the background estimates are
also shown.

                             single H tag category    multi H tag category
QCD (predicted from data)    917 ± 11                 127 ± 4
tt (from simulation)         486 ± 8                   55 ± 3
total background             1403 ± 14                182 ± 5
data                         1355                     205
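
The total background row of Table 3 is consistent with summing the two components and adding their statistical uncertainties in quadrature. The sketch below reproduces that arithmetic; treating the QCD multijet and tt uncertainties as uncorrelated is an assumption of this example, and the function name is illustrative.

```python
import math
from typing import Tuple


def combine(*components: Tuple[float, float]) -> Tuple[float, float]:
    """Sum yields and add their uncertainties in quadrature."""
    total = sum(value for value, _ in components)
    error = math.sqrt(sum(err ** 2 for _, err in components))
    return total, error


# (yield, statistical uncertainty) for QCD multijet and tt from Table 3
single_total, single_err = combine((917.0, 11.0), (486.0, 8.0))
multi_total, multi_err = combine((127.0, 4.0), (55.0, 3.0))
print(f"single H tag: {single_total:.0f} +- {single_err:.1f}")
print(f"multi H tag:  {multi_total:.0f} +- {multi_err:.1f}")
```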


The detailed list of systematic uncertainties is given below. Most of these uncertainties have an
impact on both the shapes and normalization of the sensitive variables H_T and m_bb, while the
uncertainty in the integrated luminosity only affects the normalization. The uncertainties are
summarized in Table 4.

• b tagging scale factor uncertainties: based on the measurements described in
Section 6.1 and Ref. [49], scale factors with their corresponding uncertainties are applied
to simulated samples. The scale factor uncertainties for the b tagging efficiency
depend on p_T and η. The typical size of these uncertainties is between 1 and 2% while
the mistag rate uncertainty is around 15%. The b tagging scale factor uncertainties
affect both the normalization and shape of the tt background and signal events.
Depending on the sample and signal mass point, the impact of the b tagging scale factor
uncertainty on the expected number of selected signal and tt events is 5 to 8% while
the impact of the mistag scale factor uncertainty is 0.3 to 4%.

• HEPTOPTAGGER scale factor uncertainty: the efficiency of the HEPTOPTAGGER has
been measured and compared to simulation to derive scale factors as described in
Section 6.2.2. The uncertainties in these scale factor measurements are between 3 and
6%, and are parameterized as a function of p_T. These uncertainties affect both the
normalization and shape of the tt background and signal events. The impact on the
expected number of signal and tt events is 0.4 to 2.3%.

• Jet energy corrections: dedicated energy corrections for CA15 jets are not available.
Therefore, the energy corrections for jets reconstructed with the anti-k_T algorithm
with size parameter R = 0.7 (AK7) [26] have been used [43]. It has been verified
that these corrections are valid by comparing the reconstructed jets in simulation to
the corresponding generator level jets where exactly the same clustering and grooming
algorithms have been applied. The ratio between reconstructed and generated
momentum for these jets is found to be consistent with unity, with variations that
are less than 4%. The impact of the uncertainty on the jet energy scale of filtered
CA15 jets is evaluated by varying the jet four-momentum up and down by the jet
energy scale uncertainties of AK7 jets, with an additional 4% systematic uncertainty.
The uncertainty in the subjet energy scale is assumed to be similar to the energy
scale uncertainty of AK5 jets. The impact on the expected number of selected tt and
signal events is less than 0.5% for CA15 jets and less than 5% for subjets.

• PDF uncertainties: simulated tt events are weighted according to the uncertainties
parameterized by the CTEQ6 eigenvectors [36]. The shifts produced by the individual
eigenvectors are added in quadrature in each bin of the H_T and m_bb distributions.
The resulting uncertainty in the number of expected tt events ranges from 2.4 to 8%.

• Scale uncertainties: the impact of the renormalization and factorization scale
uncertainties on the tt simulation has been studied using tt event samples generated with
two different values of these scales (moving them simultaneously up or down by a
factor of two relative to the nominal value). It has been verified that this uncertainty
has no impact on the shapes of H_T and m_bb distributions within the statistical
uncertainties of the simulated samples. The resulting impact on the selected number of tt
events is 34%.

• QCD multijet background normalization: the closure test with simulated events,
discussed in Section 8, shows no discrepancy between the predicted and observed
normalization and shape of QCD multijet events in the signal region. The comparison
of the simulated sidebands with data shows a very good agreement of the shapes as well,
but the normalization is not in agreement. Therefore a systematic uncertainty in the
normalization of QCD multijet events is taken into account. This uncertainty is derived
from the statistical precision of the closure test, which is limited by the finite size of
simulated event samples. The uncertainty in the single H tag category is 10% while the
uncertainty is 20% in the multi H tag category. The only systematic uncertainty in the
shape of the QCD multijet background arises from the subtraction of tt events. The
effect of the tt scale uncertainty on the estimation of the QCD multijet background is
less than 1%. Uncertainties in the tt simulation and the corresponding propagated
uncertainties in the QCD multijet prediction are treated as correlated, but they have
opposite effects.

• Trigger reweighting: a scale factor SF_trig is applied to correct for the different
behaviour between data and simulation in the region in which the trigger is not fully
efficient. A systematic uncertainty in the scale factor is obtained by varying SF_trig by
±0.5(1 − SF_trig). This uncertainty does not affect the plateau region of the trigger,
where SF_trig = 1. This uncertainty is taken into account both as a shape and as a rate
uncertainty. It only affects the low-H_T range. The trigger efficiency is measured in
a tt-enriched data sample. For 720 < H_T < 780 GeV/c the efficiency is 75%, with a
SF_trig of 80%. For 780 < H_T < 840 GeV/c the trigger efficiency is 93%, with a SF_trig of
94%. For H_T > 840 GeV/c the trigger has an efficiency always greater than 99% and
a SF_trig consistent with one. The overall impact of this uncertainty on the event yield
is 3.5%.

• Luminosity: an uncertainty in the integrated luminosity of 2.6% is taken into
account [50].

• Cross section of the tt background: an uncertainty of 13% is assigned to the tt cross
section. This uncertainty is obtained with the technique used in the differential tt
cross section measurement [51] for large invariant mass values of the tt system.

Table 4: Systematic uncertainties and their effect on signal and background processes,
expressed in percent. The uncertainties are described in detail in Section 9. This table shows
uncertainties in the normalization only.

uncertainty               tt           QCD        signal        signal        signal
                                       multijet   500 GeV/c^2   700 GeV/c^2   1000 GeV/c^2
b tagging:
  heavy flavour           +9.2/-7.5    -          +6.0/-6.8     +7.1/-6.5     +7.8/-8.0
  light flavour           +4.2/-3.2    -          +1.2/-0.7     +0.9/-0.6     +0.8/-1.0
HEPTopTagger              +0.9/-0.4    -          +1.6/-1.7     +1.7/-1.8     +1.8/-2.3
jet energy corrections    +5.0/-4.1    -          +3.7/-2.8     +0.7/-0.7     +0.1/-0.4
scale uncertainties       ±34          -          -             -             -
PDF                       +8.0/-4.4    -          -             -             -
trigger scale factors     +3.6/-4.0    -          +2.3/-2.3     +0.7/-0.7     +0.06/-0.08
luminosity                ±2.6         -          ±2.6          ±2.6          ±2.6
tt cross section          ±13          -          -             -             -
background estimate:
  single H tag            -            ±10        -             -             -
  multi H tag             -            ±20        -             -             -
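
The trigger scale factor variation of ±0.5(1 − SF_trig) can be illustrated with the values quoted in the trigger reweighting item. The sketch below is hypothetical in its function names and in its treatment of the bin boundaries, and it sets the scale factor above 840 GeV/c to exactly one, which the text describes only as "consistent with one".

```python
from typing import Tuple


def sf_for_ht(ht: float) -> float:
    """Trigger scale factor as a function of H_T (GeV/c), from the quoted bins."""
    if ht <= 720.0:
        raise ValueError("event is below the offline H_T threshold")
    if ht < 780.0:
        return 0.80
    if ht < 840.0:
        return 0.94
    return 1.0  # assumption: plateau region, SF_trig taken as exactly one


def trigger_sf_variations(sf: float) -> Tuple[float, float]:
    """Down and up variations: SF_trig -/+ 0.5 * (1 - SF_trig)."""
    delta = 0.5 * (1.0 - sf)
    return sf - delta, sf + delta


for ht in (750.0, 800.0, 1000.0):
    sf = sf_for_ht(ht)
    down, up = trigger_sf_variations(sf)
    print(f"H_T = {ht:6.0f} GeV/c: SF = {sf:.2f}, varied between {down:.2f} and {up:.2f}")
```

The variation vanishes on the plateau, which is why this uncertainty only affects the low-H_T range.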


10 Results 

Figure 11 shows the comparison between data and the expected background contributions for
the single and multiple H tag event categories after all event selection criteria are applied.
In the multiple H tag category only the Higgs boson candidate with the highest transverse
momentum is used. The QCD multijet background has been derived from data as discussed
in Section 8. Signal samples at three different mass points are also shown. In these plots only
signal samples in which all T quarks decay into a top quark and a Higgs boson are shown.

Figure 11: The H_T (left) and Higgs boson candidate mass (right) distributions for the single H
tag category (top) and the multiple H tag category (bottom). The QCD multijet background
is derived from data. The tt background is taken from simulation. The hypothetical signal
is shown for three different mass points: 500, 700, and 1000 GeV/c^2. The hatched error bands
show the quadratic sum of all systematic and statistical uncertainties in the background. In the
ratio plot, the statistical uncertainty in the background is depicted by the inner central band,
while the outer band shows the quadratic sum of all systematic and statistical uncertainties.


Figure 12: Discriminating variable L constructed from both H_T and m_bb for the single (top)
and the multiple (bottom) H tag categories. The three signal hypotheses with 500, 700, and
1000 GeV/c^2 are shown on the left, middle, and right, respectively. The QCD multijet
background is derived from data. The tt background is derived from simulation. The hatched error
bands show the quadratic sum of all systematic and statistical uncertainties in the background.
In the ratio plot, the statistical uncertainty in the background is depicted by the inner central
band, while the outer band shows the quadratic sum of all systematic and statistical
uncertainties.

Based on the expected distributions for the background and signal models for H_T and m_bb, a
discriminating quantity L is calculated for each event, where

    L = \ln\left( \frac{P_{\mathrm{sig}}(H_T)\, P_{\mathrm{sig}}(m_{bb})}{P_{\mathrm{back}}(H_T)\, P_{\mathrm{back}}(m_{bb})} \right).    (3)


The P variables represent the probability densities for the signal or background hypotheses.
The P_back values are obtained from the sum of the simulated tt and QCD multijet background
distributions because other background contributions are found to be negligible, as discussed
in Section 8. For the signal hypothesis, the P_sig values are obtained from simulated H_T and
m_bb distributions for each signal mass point. A binned likelihood method is used where the
values for the P variables are taken from histograms. The distribution of this variable is shown
in Fig. 12 for data compared to the background prediction and signal hypotheses, for both the
single and multiple H tag categories. As the signal model is included in the discriminator, each
signal mass hypothesis has its own definition of L. The mass points 500, 700, and 1000 GeV/c^2
are shown in these figures. The spikes in these distributions are due to the likelihood definition,
which takes its values from binned distributions.
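
A minimal sketch of such a binned discriminator is given below. The templates are toy numbers invented purely for illustration (they are not taken from the analysis), and the sketch assumes that signal and background histograms share the same binning, so that bin widths cancel in the ratio, and that no bin is empty.

```python
import math
from bisect import bisect_right
from typing import Dict, List, Tuple

Template = Tuple[List[float], List[float], List[float]]  # edges, P_sig, P_back


def normalize(counts: List[float]) -> List[float]:
    """Convert histogram contents to per-bin probabilities."""
    total = float(sum(counts))
    return [c / total for c in counts]


def bin_index(edges: List[float], x: float) -> int:
    """Index of the bin containing x; out-of-range values use the edge bins."""
    i = bisect_right(edges, x) - 1
    return min(max(i, 0), len(edges) - 2)


def discriminator(ht: float, mbb: float, templates: Dict[str, Template]) -> float:
    """L = ln[P_sig(H_T) P_sig(m_bb) / (P_back(H_T) P_back(m_bb))]."""
    value = 0.0
    for name, x in (("ht", ht), ("mbb", mbb)):
        edges, p_sig, p_back = templates[name]
        i = bin_index(edges, x)
        value += math.log(p_sig[i] / p_back[i])
    return value


# Toy templates (hypothetical numbers, for illustration only).
TOY: Dict[str, Template] = {
    "ht": ([720.0, 900.0, 1200.0, 2000.0],
           normalize([1, 3, 6]), normalize([6, 3, 1])),
    "mbb": ([60.0, 100.0, 140.0, 200.0],
            normalize([2, 6, 2]), normalize([5, 3, 2])),
}

print(discriminator(1500.0, 120.0, TOY))
```

Because every event falling in the same pair of bins receives exactly the same value of L, the distribution of L takes only a finite set of values, which is the origin of the spikes mentioned above.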

No signal-like excess is observed in data. Bayesian upper limits [52] on the T quark
production cross section are obtained with the Theta framework [53]. The nuisance parameters are
assigned to the sources of systematic uncertainties reported in Section 9, which are taken into
account as global normalization uncertainties and as shape uncertainties where applicable. The
shape uncertainties are taken into account by interpolating between the nominal and ±1σ
templates of the likelihood distributions. Figure 13 shows the observed and expected limits
on the T pair production cross section, for the hypothesis of an exclusive branching fraction
B(T → tH) = 100% using the combination of both the single and multiple H tag event
categories. T quarks exclusively decaying into tH and with mass values below 745 GeV/c^2 are
excluded at 95% confidence level (CL), with an expected exclusion limit of 773 GeV/c^2. Due to
the lower background contamination, the multiple H tag event category provides the largest
contribution to the achieved sensitivity.

Figure 13: Observed (solid line) and expected (dotted line) Bayesian upper limits on the T
quark production cross section, as a function of m_T (GeV/c^2), determined from the variable L
for the combination of the single and multiple H tag categories, for the hypothesis of an
exclusive branching fraction B(T → tH) = 100%. The green (inner) and yellow (outer) bands show
the 1σ (2σ) uncertainty ranges, respectively. The dashed line shows the prediction of the theory
as discussed in Section 3.


In evaluating limits, the other decay modes of the T quark must be considered. For mixed
branching fractions there are six distinct final states: tHtH, tHtZ, tHbW, bWbW, bWtZ, tZtZ.
Three of these final states contain at least one tH decay. This means that the single H tag
category of this analysis is sensitive also to non-exclusive branching fractions. Furthermore,
we also expect some sensitivity to tZ decays because the mass of the Z boson differs from the
mass of the Higgs boson by only 35 GeV/c^2 and because it decays into b quark pairs with a
branching fraction of 15.6%. A selection efficiency of 4.5% is found for the tHtZ final state, 3%
for tHbW, and 2% for tZtZ for a T quark mass of 800 GeV/c^2. These efficiencies are calculated
in the same way as those for tHtH in Table 1.
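
For pair-produced T quarks with branching fractions B(tH), B(tZ), and B(bW) summing to one, the six final states occur with fractions B_i^2 for identical decays and 2 B_i B_j for mixed ones. The sketch below combines these fractions with the efficiencies quoted for m_T = 800 GeV/c^2. It is an illustration only: the efficiencies for bWbW and bWtZ are not quoted in the text and are set to zero here as a placeholder assumption, so the result is a lower bound rather than the value used in the analysis.

```python
import math
from typing import Dict


def final_state_fractions(b_th: float, b_tz: float, b_bw: float) -> Dict[str, float]:
    """Fractions of the six TT final states for given branching fractions."""
    if not math.isclose(b_th + b_tz + b_bw, 1.0):
        raise ValueError("branching fractions must add up to one")
    return {
        "tHtH": b_th ** 2,
        "tZtZ": b_tz ** 2,
        "bWbW": b_bw ** 2,
        "tHtZ": 2.0 * b_th * b_tz,
        "tHbW": 2.0 * b_th * b_bw,
        "bWtZ": 2.0 * b_bw * b_tz,
    }


# Efficiencies at m_T = 800 GeV/c^2: tHtH from Table 1, others from the text.
# bWbW and bWtZ are NOT quoted; zero is a placeholder assumption.
EFFICIENCY_800 = {"tHtH": 0.072, "tHtZ": 0.045, "tHbW": 0.03,
                  "tZtZ": 0.02, "bWbW": 0.0, "bWtZ": 0.0}


def effective_efficiency(b_th: float, b_tz: float, b_bw: float) -> float:
    """Fraction-weighted selection efficiency for a mixed-decay signal."""
    fractions = final_state_fractions(b_th, b_tz, b_bw)
    return sum(fractions[state] * EFFICIENCY_800[state] for state in fractions)


print(effective_efficiency(1.0, 0.0, 0.0))    # exclusive tH decays
print(effective_efficiency(0.5, 0.25, 0.25))  # one mixed scenario
```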

A dedicated optimization is not performed for the non-exclusive decay modes. Nevertheless, 
exclusion limits are calculated for all branching fractions from a scan of all allowed values. 
Simulated signal samples have been produced for each set of branching fractions used in the 
scan. 


Observed and expected lower limits on the mass of the T quark for different branching
fractions are listed in Table 5 and shown in Fig. 14. Table 5 shows only those branching fractions
for which actual mass limits exist (where the theory curve crosses the limit curve). A good
sensitivity is achieved for T → tH branching fractions down to 80%. The observed and expected
limits on the production cross section for different branching fractions are given in Table 6 and
shown in Fig. 15.

Table 5: Observed and expected lower limits on the mass of the T quark (in GeV/c^2) for a range
of T quark branching fraction hypotheses listed in the first three columns. Only combinations
for which an observed limit is found are reported. When the limit lies below the scanned mass
region between 500 and 1000 GeV/c^2 a value of < 500 is indicated.

Figure 14: Branching fraction triangle with observed upper limits (left) and expected limits
(right) for the T quark mass. Every point in the triangle corresponds to a particular set of
branching fraction values subject to the constraint that all three add up to one. The branching
fraction for each mode decreases from one at the corner labelled with the specific decay mode
to zero at the opposite side of the triangle.

Table 6: Branching fractions (first three columns) and the observed and expected upper limits
on the cross section for different mass values of the T quark. The expected limits are quoted
with their corresponding uncertainties while the observed limits are quoted without
uncertainties. The cross section limits are given in units of pb, while the T quark mass values
are given in units of GeV/c^2.

Figure 15: Branching fraction triangle with observed (top) and expected (bottom) limits on the
T quark pair production cross section for three different T quark mass hypotheses: 500 (left),
700 (middle), and 1000 GeV/c^2 (right). Every point in the triangle corresponds to a particular set
of branching fraction values subject to the constraint that all three add up to one. The branching
fraction for each mode decreases from one at the corner labelled with the specific decay mode
to zero at the opposite side of the triangle.

11 Summary

A search for heavy resonances decaying to top quarks and Higgs bosons has been performed
using proton-proton collisions recorded with the CMS detector at √s = 8 TeV, corresponding
to an integrated luminosity of 19.7 fb^-1. The benchmark model considered is a heavy
vector-like T quark that decays into bW, tZ, and tH in all-hadronic final states. The analysis
makes use of jet substructure techniques including algorithms for the identification of boosted
top quarks, boosted Higgs bosons, and subjet b tagging. Results are presented for exclusive T
quark decay modes as well as for non-exclusive branching fractions. If the heavy T quark has a
branching fraction of 100% for T → tH, the observed (expected) exclusion limit on the mass of
the T quark is 745 (773) GeV/c^2 at 95% confidence level. This limit is similar to that obtained
from leptonic final states [14]. These results are the first to exploit the all-hadronic final state
in the search for vector-like quarks and they facilitate the combination with other analyses to
improve the mass reach.